ABSTRACT

The extension of SPARQL in version 1.1 with property paths
offers a type of regular path query for RDF graph databases.
Such queries are difficult to optimize and evaluate efficiently,
however. We have embarked on a project, Waveguide, to
build a cost-based optimizer for SPARQL queries with property
paths. Waveguide builds a query plan, called a waveguide
plan (WGP), which guides the query evaluation. There are
numerous choices in the construction of a plan, and a number
of optimization methods, meaning the space of plans for
a query can be quite large. Execution costs of plans for the
same query can vary by orders of magnitude. We illustrate
the types of optimizations this approach affords and the performance
gains that can be obtained. A WGP's costs can be
estimated, which opens the way to cost-based optimization.

1. INTRODUCTION 

Graph data is rapidly becoming prevalent with the rise of
the Semantic Web, social networks, and data-driven exploration
in the life sciences. There is a need for natural and efficient
ways to query over these graphs.

The Resource Description Framework (RDF) [19] provides
a data model for graph data. An RDF store is a
set of triples that describes a directed, edge-labeled multigraph.
A triple, (s, r, o), denotes an edge from node “s” (the
subject) to node “o” (the object)^1 with the edge labeled by
“r” (the role, also called the label or the predicate).

Correspondingly, the SPARQL query language [18] provides
a formal means to query over RDF stores. A query
defines sub-graph match criteria; its evaluation over an RDF
store returns all embedded sub-graphs meeting the criteria.
For example, the query “?friend :friendOf Charles” evaluates
to a list of people (nodes, binding to variable “?friend”) who
are friends of (role “:friendOf”) “Charles” (a named node, so
a constant). This is a simple query, of course, and could be
evaluated just by extracting the triples with “r = :friendOf”
and “o = Charles”. For even an only slightly more complicated
query, however, it may not be straightforward to find
a plan to evaluate it efficiently.

^1 The object in an RDF triple is allowed to be a literal as
well as a node. However, this distinction is not important
for us.

In its current version, 1.1, SPARQL's expressiveness has
been extended with property paths [11]. Instead of specifying
the path of interest explicitly between nodes, one may
now specify it implicitly via a regular expression. (This
also means matching paths in the graph are not bounded in
length by the query's expression, while they are in SPARQL
1.0.) For example, the query “?friend :friendOf+ Charles”
evaluates to a list of people who are friends of “Charles”, or
friends of people who are friends of “Charles”, and so forth.^2

Property paths effectively introduce the concept of regular
path queries (RPQs), well studied before the advent
of RDF and SPARQL, into the query language. While
eminently useful, such queries are even more challenging to
optimize well. We have embarked on a long-term project
called Waveguide with the ultimate goal to provide viable
cost-based query optimization and evaluation for SPARQL
over RDF stores that is on par with the state of the art for
relational database systems.

We address the critical first step of this endeavor, defining
a rich plan space (the space of waveguide plans, or WGPs)
for SPARQL queries. We focus on single-path, property-path
queries, essentially the RPQ fragment of SPARQL 1.1.
We consider a set semantics (the distinct directive in each
query) and thus do not consider aggregation. Contributions
of this work are as follows.

1. Waveguide-plan space.

(a) Summarize the state of the art for evaluation of
RPQs and SPARQL property paths (§2). Establish
why none suffices (§2.4).

(b) Devise the waveguide plan space (§3). Demonstrate
that it subsumes the state of the art, and extends
well beyond it (§3.5).

(c) Model the cost factors that determine the efficiency
of plans (§4). Present the powerful optimizations
offered by waveguide plans (§4.3).

2. Performance study.

(a) Provide an evaluation framework (§5.1).

(b) Benchmark query plans for realistic queries over
real RDF stores / graphs (§5). Substantiate the
optimizations of our approach (§5.3, §5.4, & §5.5).

(c) Justify the necessity of planning and the waveguide
plan space (§5).


^2 “:friendOf+” represents the transitive closure over edges
labeled as “:friendOf”.

A waveguide plan consists of a collection of (non-deterministic)
finite automata for the property path and
search directives which guides the query evaluation. In
[22], we demonstrated that, with proper choice of plan, we
can gain orders of magnitude performance improvement for
many property-path queries over real datasets, while maintaining
comparable performance for other queries, relative to
leading SPARQL query engines such as Jena [12] and Virtuoso
[9]. We show that planning is critical to evaluate
SPARQL queries efficiently, and that choosing the right
plan depends on the underlying graph data and thus ultimately
must be cost-based.

2. BACKGROUND & RELATED WORK 

In §2.1, we provide relevant background on path queries. 
The literature on path queries over graphs, as is pertinent 
to property paths, comes from two distinct sources: 

1. work on regular path queries (RPQs); and 

2. work on SPARQL platforms to extend to version 1.1 
to handle property paths. 

Research on RPQs, which well precedes RDF and
SPARQL, has mostly focused on theoretical aspects, with little
attention to performance issues in evaluating such queries in
practice.^3 The seminal work that introduced the G+ query
language [16] exploited the natural observation that where
there is a regular expression, there is a finite automaton (FA)
that is a recognizer for it. The authors showed how to use finite
state machines to direct search over the graph to evaluate an
RPQ. In essence, an FA corresponding to the query's regular
expression provides a plan for its evaluation. Subsequent
work on RPQs followed on this idea. Let us call this the FA
approach. We overview this approach in §2.2.

Work on evaluating property paths (much newer by
virtue of the fact that the SPARQL 1.1 standard is quite
recent) meanwhile has mirrored the dynamic-programming
approach behind the algorithm presented in the seminal
work of [15]. This can be modeled by an extended relational
algebra (RA) that includes an operator α for transitive closure
(α-RA) [3]. Let us call this the α-RA approach.

As with the FA approach for RPQs, α-RA suffices for evaluating
property paths. The full power of relational algebra,
as extended with α, can then be employed to devise an evaluation
plan (an α-RA-expression tree) based on the regular
expression of the property path. This general approach is
found behind many SPARQL platforms, as it follows relational
techniques well. For example, Virtuoso [9], a leading
SPARQL system which is also a well-established relational
database system, extended its platform to accommodate
property paths by adding an “α” operator to the engine. We
present this approach and characterize it by α-RA in §2.3.

Work on property-path evaluation has been remiss in not
drawing the connection to RPQs. How do the FA and α-RA
approaches compare? Does one subsume the other? Or are
they incomparable? The latter is, in fact, the case, and we
show this in §2.4. Furthermore, a combined approach might
be superior. We show that it is in §3.

Both the FA and α-RA approaches effectively provide evaluation
plans for property-path queries. However, the plan
spaces that are implicit in these approaches have not been
considered. In FA, choosing a different (but still correct)
automaton for the plan might offer a significantly more efficient
plan. In systems taking the α-RA approach, limited
planning is sometimes done, but not in a formal way to enumerate
through the plan space to find a best estimated plan,
as is done in relational systems. We address this in §4.

^3 Regular-path queries have been considered under both simple-
and arbitrary-path semantics. Under simple-path semantics,
a path in the graph to match must not repeat any
nodes; under arbitrary-path semantics, it may. SPARQL
adopts arbitrary-path semantics, for the sake of tractability.

Figure 1: An example graph database (node labels visible in the figure include en:Gundam, en:Tokyo, and en:Japan).

2.1 Path Queries on Graphs 

A graph database $G$ can be defined as $(N, \Sigma, E)$ for which
$N$ is a finite set of nodes (vertices), $\Sigma$ is a finite alphabet
(a set of labels), and $E$ is a set of directed, labeled edges,
$E \subseteq N \times \Sigma \times N$.

A path in a graph is defined as a sequence
$p = n_0 a_0 n_1 \ldots n_{k-1} a_{k-1} n_k$ such that $n_i \in N$, for $0 \le i \le k$, and
$(n_i, a_i, n_{i+1}) \in E$, for $0 \le i < k$. The path-induced path
label $\lambda(p)$ is the string $a_0 a_1 \ldots a_{k-1} \in \Sigma^*$ (for which $\Sigma^*$ is the set
of all finite strings formed over $\Sigma$). Each node $n \in N$ is
associated with an empty path, $n$, the path label of which is
the empty string, denoted by $\varepsilon$.

A regular expression over alphabet $\Sigma$ is defined inductively,
as follows: 1. the empty string $\varepsilon$ and each symbol
$a \in \Sigma$ are regular expressions; and, 2. given regular expressions $r$, $r_1$, and $r_2$, then
(a) the concatenation $r_1 r_2$, (b) the disjunction $r_1 | r_2$, and (c)
the Kleene star $r^*$ are regular expressions. The regular language defined by the regular
expression $r$ is denoted by $L(r)$. This language is defined
inductively, as follows: 1. $L(\varepsilon) = \{\varepsilon\}$ and $L(a) = \{a\}$, for
each $a \in \Sigma$; and, 2. for inductively combining strings, (a)
$L(r_1 r_2) = L(r_1) \cdot L(r_2)$, (b) $L(r_1 | r_2) = L(r_1) \cup L(r_2)$, and
(c) $L(r^*) = \{\varepsilon\} \cup \bigcup_{i=1}^{\infty} L(r)^i$.

A regular path query $Q$ is a tuple $(x, r, y)$ for which $x$ and
$y$ are free variables (that range over nodes) and $r$ is a regular
expression. An answer to $Q$ over graph $G = (N, \Sigma, E)$ is a
pair $(s, t) \in N \times N$ such that there exists an arbitrary path
$p$ from node $s$ to node $t$ for which the path label $\lambda(p)$ is
in language $L(r)$; that is, $\lambda(p) \in L(r)$. The answer set of $Q$ over
graph $G$ is the set of all answers of $Q$ over $G$.
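
To make these definitions concrete, the following minimal sketch computes the path label of a path and tests whether it belongs to the language of a regular expression, using Python's standard `re` module. The helper names `path_label` and `conforms`, the single-character labels, and the sample path are assumptions made for illustration only; they are not part of Waveguide.

```python
import re

def path_label(path):
    """Return the path label: the concatenation of the edge labels.

    `path` alternates nodes and labels: [n0, a0, n1, ..., a_{k-1}, nk].
    Labels are assumed to be single characters so that they concatenate
    into a string over the alphabet.
    """
    return "".join(path[1::2])

def conforms(path, regex):
    """True if the path label lies in the language of `regex`."""
    return re.fullmatch(regex, path_label(path)) is not None

# Hypothetical path n0 -a-> n1 -b-> n2 -a-> n3 -b-> n4.
p = ["n0", "a", "n1", "b", "n2", "a", "n3", "b", "n4"]
print(path_label(p))                # abab
print(conforms(p, "(ab)+"))         # True
print(conforms(["n0"], "(ab)*"))    # True: the empty path has the empty label
```

Under this reading, the pair (n0, n4) would be an answer to the query (x, (ab)+, y) whenever such a path exists in the graph.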

Regular path queries have been considered since semi-structured
data models were first introduced [2, 16]. The
complexity of RPQs for graph databases particularly has
been well studied [4, 5]. In [14], the idea of employing NFAs
to guide search for RPQ evaluation appears. In [10], the authors
perform a fixpoint evaluation for property paths. In [21],
we present a precursor of Waveguide that explores fixpoint
evaluation for property paths using SQL recursion.

Regular path queries provide a useful mechanism for
querying data in many application domains. For example,
consider the knowledge base dataset of the Linked Open
Data (LOD) cloud. LOD is a community effort which aims
to interlink the structural information available in various
datasets on the Web (such as Wikipedia, WordNet, and others),
and make it available as a single RDF graph.

RPQs prove useful in querying such linked data by providing
a convenient declarative mechanism which can be used
to answer queries without prior knowledge of the underlying
data paths.




Example 1. Consider the part of a LOD graph database
presented in Fig. 1. This represents information about the Gundam
robot statue in Odaiba in Tokyo. The data has been
integrated from two datasets, identified by the prefixes en
and jp, standing for the English and Japanese Wikipedia, respectively.
The data entities between these two datasets are
interlinked by using OWL ontology terms. Equivalent entities
are connected with owl:sameAs edges. In this case, the
Japanese dataset contains richer spatial information related
to the statue than does the English dataset.

Say a user wants to know in which country this Gundam
statue is located. Since there are no direct :isLocatedIn
edges outgoing from en:Gundam (as is often the case in
linked data), the graph needs to be searched. During the
search, equivalent data entities need to be resolved by following
:sameAs edges. Likewise, a spatial hierarchy needs
to be computed by following :isLocatedIn edges. This search
can be defined by the following SPARQL query pattern:

en:Gundam (:sameAs*/:isLocatedIn)+/:sameAs* ?place .    (Q1)

Q1 computes the spatial hierarchy starting from node
en:Gundam, using information from both interlinked
datasets to resolve equivalent entity closures.

2.2 FA Plans 

Regular expressions are a formal notation for patterns
that generate strings (called words) over an alphabet. The
set of words that a given regular expression can generate is
called its language.

The dual to generation is recognition. Finite state automata
are the recognition counterpart to regular expressions.
For any given regular expression, a finite state
automaton (abbreviated as finite automaton) can be constructed
that will recognize the words over the alphabet that
belong to the expression's language.

Thus, an FA A can be constructed to recognize the language
of a given regular expression r. One can construct
one such FA by traversing the parse tree of r bottom up,
and combining the automata that recognize sub-expressions
of r into a composite automaton via union, concatenation,
and closure of the sub-automata as is appropriate.
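
The bottom-up construction described above can be sketched as follows. This is a Thompson-style construction written for illustration; the nested-tuple encoding of parse trees and the names `build_enfa`, `eps_closure`, and `accepts` are assumptions of the sketch, and no attempt is made to reduce the resulting ε-NFA.

```python
from itertools import count

def build_enfa(tree):
    """Build an epsilon-NFA bottom-up from a regex parse tree.

    Trees are nested tuples (an assumed encoding):
      ("sym", a), ("cat", t1, t2), ("alt", t1, t2), ("star", t), ("plus", t).
    Returns (start, accept, transitions); a label of None marks an
    epsilon transition.
    """
    fresh = count()
    trans = []

    def walk(t):
        kind = t[0]
        if kind == "sym":                      # base case: one labeled edge
            s, f = next(fresh), next(fresh)
            trans.append((s, t[1], f))
            return s, f
        if kind == "cat":                      # concatenation
            s1, f1 = walk(t[1])
            s2, f2 = walk(t[2])
            trans.append((f1, None, s2))
            return s1, f2
        if kind == "alt":                      # union
            s1, f1 = walk(t[1])
            s2, f2 = walk(t[2])
            s, f = next(fresh), next(fresh)
            trans.extend([(s, None, s1), (s, None, s2),
                          (f1, None, f), (f2, None, f)])
            return s, f
        if kind in ("star", "plus"):           # closure
            s1, f1 = walk(t[1])
            s, f = next(fresh), next(fresh)
            trans.extend([(s, None, s1), (f1, None, f), (f1, None, s1)])
            if kind == "star":                 # star also accepts zero repetitions
                trans.append((s, None, f))
            return s, f
        raise ValueError(f"unknown node kind: {kind}")

    start, accept = walk(tree)
    return start, accept, trans

def eps_closure(states, trans):
    """All states reachable from `states` via epsilon transitions only."""
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for (p, label, r) in trans:
            if p == q and label is None and r not in seen:
                seen.add(r)
                stack.append(r)
    return seen

def accepts(nfa, word):
    """Simulate the epsilon-NFA on a sequence of labels."""
    start, accept, trans = nfa
    current = eps_closure({start}, trans)
    for a in word:
        step = {r for (p, label, r) in trans if p in current and label == a}
        current = eps_closure(step, trans)
    return accept in current

# The property path of Q1: (:sameAs*/:isLocatedIn)+/:sameAs*
q1 = ("cat",
      ("plus", ("cat", ("star", ("sym", ":sameAs")), ("sym", ":isLocatedIn"))),
      ("star", ("sym", ":sameAs")))
nfa = build_enfa(q1)
print(accepts(nfa, [":sameAs", ":isLocatedIn", ":isLocatedIn", ":sameAs"]))  # True
print(accepts(nfa, [":sameAs"]))                                             # False
```

Each parse-tree node contributes at most two fresh states, so the automaton grows linearly with the size of the expression.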


Example 2. Recall query Q1 from Ex. 1. As shown in
Fig. 2, an automaton construction for this query is a two-step
procedure. First, traversing the parse tree of r bottom
up, the ε-NFA is built up, by the base case and the inductive
rules. Second, the resulting ε-NFA is then reduced to an
NFA, which typically has smaller size, and hence, is more
efficient to process.

Figure 2: An ε-NFA (a) and the corresponding reduced NFA (b) for Q1.

The first algorithm to use automata to evaluate regular
expressions on graphs was presented in [16] as a part of an
implementation of the G+ query language. Given a graph
database $G = (N, \Sigma, E)$ and a query $Q = (s, L(r), t)$ in which
$s$ and $t$ are nodes in $G$, the algorithm proceeds as follows.
The expression $r$ is converted into a finite automaton $A_Q$ by
using the bottom-up traversal of the parse tree of $r$, as discussed.
Then, the graph database $G$ is converted to a finite automaton
$A_G$ with graph nodes becoming automaton states and graph
edges becoming transitions. Node $s$ is assigned to be the
initial state, and $t$ is assigned to be the accepting state in
$A_G$.

Then, given $A_G$ and $A_Q$, a product automaton $P = A_G \times A_Q$
is constructed. $P$ is then tested for non-emptiness, which
checks whether any accepting state can be reached from the
initial state. If the language defined by $P$ is not empty, then
the answer for the reachability query $(s, L(r), t)$ on graph $G$
is “yes”; i.e., there exists a path between $s$ and $t$ in $G$ that
conforms to $r$. This idea of employing a product automaton
for RPQ evaluation over graphs has been used in [6, 13, 14,
16, 17, 23].

Figure 3: Example product construction of automata.

Example 3. Given query Q1 and the database G from
Ex. 1, the corresponding product automaton $P = A_G \times A_Q$
is shown in Fig. 3. $P$ is a representation of the search
space that needs to be explored to answer Q1. $P$ can
be explored using any search strategy (e.g., breadth-first
search) starting from the initial state (en:Gundam, $q_0$). All
reachable accepting states (shown in bold) are the answers
to Q1.


2.3 α-RA Plans

An alternative approach is to use the α-extended relational
algebra (α-RA) to produce evaluation plans for RPQs.
The α operator computes the transitive closure of a relation.
Let the graph database be represented as a relation of triples
$G(s, p, o)$. Let $T = \pi_{1,3}(G)$; thus $T$ consists of pairs of nodes
$(s, o)$ such that the pair is connected by a directed edge in
the graph. Then α applied to $T$ computes the least fixpoint
of the following operation:

$$T^{+} = T \cup \pi_{1,3}\left(T^{+} \bowtie_{T^{+}.o = T.s} T\right) \qquad (1)$$

Figure 4: (a) A parse tree and (b) an α-RA tree for query Q1.

Figure 5: Example plans: (a) an FA plan; (b) an α-RA plan; (c) a Venn diagram of how the plan spaces are related.


Thus, α(T) results in all pairs of nodes such that, for the
nodes of each pair, there exists a path between them in the
graph G. If we were to evaluate the fixpoint by
a semi-naive evaluation, each iteration of evaluation is over
paths of length one greater than that of the previous iteration.
The process stops when no new pairs are added; i.e., the
fixpoint has been reached.
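
A semi-naive evaluation of the α operator can be sketched as below. This is only an in-memory illustration over a Python set of pairs; the function name `alpha` and the adjacency index `by_source` are assumptions of the sketch, and a relational engine would realize the same fixpoint with joins.

```python
def alpha(T):
    """Transitive closure of a binary relation T by semi-naive iteration."""
    # Index T by source node so that the join T+.o = T.s is a lookup.
    by_source = {}
    for s, o in T:
        by_source.setdefault(s, set()).add(o)

    closure = set(T)   # pairs found so far (paths of length >= 1)
    delta = set(T)     # pairs found in the latest iteration only
    while delta:
        # Join only the newly found pairs with T: (x, y) in delta, (y, z) in T.
        candidates = {(x, z) for (x, y) in delta for z in by_source.get(y, ())}
        delta = candidates - closure   # keep only pairs not seen before
        closure |= delta
    return closure

print(sorted(alpha({(1, 2), (2, 3)})))   # [(1, 2), (1, 3), (2, 3)]
```

Joining only the delta with T, rather than the whole closure, is what makes the evaluation semi-naive: iteration k extends exactly the pairs first discovered in iteration k-1.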

Given the SPJRU (select-project-join-rename-union) relational
algebra extended with the α operator, one can evaluate
the RPQ $Q = (x, L(r), y)$ over graph $G = (N, \Sigma, E)$ by the
algorithm proposed in [15]. This traverses the syntax tree
of expression $r$ bottom-up. Let $s$ be the sub-expression of $r$
represented by a given node in the parse tree. The binary relation
$R_s \subseteq N \times N$ is computed so that node pair $(u, v) \in R_s$
iff there exists a path from $u$ to $v$ in $G$ that matches $s$.

The manner in which the relations are joined going
bottom-up in the parse tree depends upon the type of the
node. The cases are as follows:

1. If $s$ is a $\Sigma$-symbol, then $R_s := \{(u, v) \mid (u, s, v) \in E\}$.

2. If $s = \varepsilon$, then $R_s := \{(u, u) \mid u \in N\}$.

3. If $s_1$ and $s_2$ are sub-expressions and $s = s_1 | s_2$, then
$R_s := R_{s_1} \cup R_{s_2}$.

4. If $s_1$ and $s_2$ are sub-expressions and $s = s_1 \cdot s_2$, then
$R_s := \pi_{1,3}(R_{s_1} \bowtie_{R_{s_1}.2 = R_{s_2}.1} R_{s_2})$.

5. If $s = s_1^{*}$, then $R_s$ is the reflexive and transitive closure
of $R_{s_1}$, or $R_s := \alpha(R_{s_1}) \cup R_{\varepsilon}$.

6. If $s = s_1^{+}$, then $R_s$ is the transitive closure of $R_{s_1}$, or
$R_s := \alpha(R_{s_1})$.

(Correctness of this algorithm is established in [15].)
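
The six cases above translate almost directly into a recursive evaluator. The sketch below is a hedged illustration rather than the algorithm of [15] as implemented in any system: the tuple encoding of parse trees, the names `evaluate` and `transitive_closure`, and the toy graph are all assumptions.

```python
def transitive_closure(R):
    """Least fixpoint of R+ = R U (R+ joined with R), computed semi-naively."""
    closure, delta = set(R), set(R)
    while delta:
        delta = {(u, w) for (u, v) in delta for (x, w) in R if v == x} - closure
        closure |= delta
    return closure

def evaluate(tree, nodes, edges):
    """Compute R_s for the sub-expression s at the root of `tree`.

    tree:  ("sym", a) | ("eps",) | ("alt", t1, t2) | ("cat", t1, t2)
           | ("star", t) | ("plus", t)      (an assumed encoding)
    nodes: set of graph nodes N; edges: set of (u, label, v) triples E.
    """
    kind = tree[0]
    if kind == "sym":        # case 1: pairs connected by one edge labeled s
        return {(u, v) for (u, a, v) in edges if a == tree[1]}
    if kind == "eps":        # case 2: identity relation over N
        return {(u, u) for u in nodes}
    if kind == "alt":        # case 3: union
        return evaluate(tree[1], nodes, edges) | evaluate(tree[2], nodes, edges)
    if kind == "cat":        # case 4: join on the shared middle node
        left = evaluate(tree[1], nodes, edges)
        right = evaluate(tree[2], nodes, edges)
        return {(u, w) for (u, v) in left for (x, w) in right if v == x}
    if kind == "star":       # case 5: reflexive and transitive closure
        inner = evaluate(tree[1], nodes, edges)
        return transitive_closure(inner) | {(u, u) for u in nodes}
    if kind == "plus":       # case 6: transitive closure
        return transitive_closure(evaluate(tree[1], nodes, edges))
    raise ValueError(f"unknown node kind: {kind}")

# Toy graph 1 -a-> 2 -b-> 3 -a-> 4 -b-> 5 and the expression (a/b)+.
nodes = {1, 2, 3, 4, 5}
edges = {(1, "a", 2), (2, "b", 3), (3, "a", 4), (4, "b", 5)}
ab_plus = ("plus", ("cat", ("sym", "a"), ("sym", "b")))
print(sorted(evaluate(ab_plus, nodes, edges)))   # [(1, 3), (1, 5), (3, 5)]
```

Note that the closure operates on the fully computed relation for its operand, which is the property of α-RA plans that §2.4 contrasts with the pipelined behavior of FA plans.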

Example 4. Given query Q1 and the database G from
Ex. 1, the corresponding α-RA tree is shown in Fig. 4.

The α-RA-based RPQ evaluation can be directly implemented
in most relational databases and relational triplestores.
In [21], we proposed a method that translates RPQs
as defined by SPARQL property paths into recursive SQL.
A similar approach was used by Dey et al. [8] in the context
of the evaluation of provenance-aware RPQs by a relational
engine.

2.4 Comparing Plan Spaces 

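The FA and α-RA approaches each entail a plan space;
that is, the plans collectively an approach produces over
all possible property-path queries. Let $\mathcal{P}_{FA}$ and $\mathcal{P}_{\alpha\text{-RA}}$ denote
the plan spaces for FA and α-RA, respectively. To
understand how the approaches are related (for instance,
whether one approach subsumes the other, or whether they
are incomparable), we consider these plan spaces. The Venn
diagram of how they are related is shown in Fig. 5c.^4

Claim 1. $\mathcal{P}_{FA}$ and $\mathcal{P}_{\alpha\text{-RA}}$ are incomparable
($\mathcal{P}_{FA} - \mathcal{P}_{\alpha\text{-RA}} \neq \emptyset$ and $\mathcal{P}_{\alpha\text{-RA}} - \mathcal{P}_{FA} \neq \emptyset$),
but overlap ($\mathcal{P}_{FA} \cap \mathcal{P}_{\alpha\text{-RA}} \neq \emptyset$).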

Of course, we are taking liberties; the plan spaces should
be over the same domain of plans. As we have presented
things, however, they are not; we have presented FA plans
as automata and α-RA plans as algebraic trees. To prove
formally the claim in Fig. 5c, we would need to establish an
isomorphism between FA and α-RA plans, or have a canonical
form for plans to which each plan type could be mapped.
This can be done. The formalism for waveguide plans we will
present in §3 would suffice for this mapping. Datalog, or
the relational algebra extended by while loops (established to
be expressively equivalent to Datalog) [1], would provide
an even more universal domain that would suffice.

This is beyond the scope of what we can do here. Still, we
can easily establish informally that these spaces are distinct.
Consider the following generic property-path query pattern:

?x (a/b)+ ?y .    (Q2)

We shall be using Q2 as a running example. Here, “a” and
“b” are stand-ins for labels. It matches node-pairs that are
connected by some path labeled ab, abab, or ababab, and so
forth. This is quite a simple property-path query, but one
that already demonstrates the complexities of planning.

The FA plan in Fig. 5a would be in $\mathcal{P}_{FA}$ for Q2. There is
no α-RA plan that could be equivalent to it, however; none
would ever evaluate aba, ababa, and so forth, as an intermediate
state does in the FA plan. α-RA plans cannot compute transitive
closure in a pipelined fashion as the FA plan is doing; the α
operator acts over an entire relation.

The α-RA plan in Fig. 5b would be in $\mathcal{P}_{\alpha\text{-RA}}$ for Q2. There
is no FA plan that could be equivalent to it, however; no
state transition in its automata can represent the “join” with
ab. FA plans do not encompass views, materialized parts of
the query that can be reused, while the α-RA plan does by
effectively materializing ab to join repeatedly on it.

Meanwhile, there are many plans in common between FA
and α-RA: any query that is restricted to transitive closure
over single labels, for example, will result in common
FA and α-RA plans.

3. WAVEGUIDE PLANS

Waveguide's evaluation strategy is based on an iterative
search algorithm (and variations of it) which is guided by
the WGPs. We are able to express complex query evaluation
plans which involve multiple search wavefronts that
iteratively explore the graph. The states of the wavefront
automata in a WGP represent path queries in their own
right. As the WGP (selectively) materializes states during
evaluation, which we call path views, this allows wavefronts
to re-use intermediate results (paths) that were already
discovered by the search process.

^4 The diagram's claim that the plan space of waveguide
plans, $\mathcal{P}_{WGP}$, properly subsumes both $\mathcal{P}_{FA}$ and $\mathcal{P}_{\alpha\text{-RA}}$ is
taken up in §3.5.

3.1 Wavefronts 

In Waveguide, we propose a novel strategy to perform
path search efficiently while simultaneously recognizing the
path expressions. Waveguide's input is a graph database
G and a waveguide plan (WGP) $P_Q$ which guides a number
of search wavefronts that explore the given graph. This
graph exploration, driven by an iterative search procedure,
is inspired by the semi-naive bottom-up strategy used in
evaluation of linear recursive expressions based on fixpoint,
as is done for the α operator for α-RA, described in §2.3.

The key idea is, given a seed as a start, to expand repeatedly
the search wavefronts in the graph until no new tuples
are produced; i.e., we reach fixpoint. Each search wavefront
is guided by a wavefront automaton, a finite state machine
based on non-deterministic finite automata (NFA). This is
akin to the FA approach discussed in §2.2. Different, though,
from NFAs which are used as recognizers of regular expressions
on strings, wavefront automata introduce a number
of features directed to evaluation of regular expressions on
graphs. These include the use of seeds, append and prepend
transitions, and path views.

First, we present the iterative procedure used in Waveguide
that drives the wavefront expansion. Next, we describe
the new types of transitions enabled by the wavefront
data structure. Finally, we discuss the interactions between
different wavefronts guided by a plan, which can be used for
optimization.

3.2 Expanding a Wavefront 

Each search wavefront has a seed as its initialization. The
seed is the set of nodes in the graph from which this wavefront
begins its search. A seed can be either universal or
restricted. A wavefront with a universal seed conducts its
search effectively starting from every node in the graph. A
wavefront with a restricted seed is restricted to starting its
search just from those nodes in its seed. (A restricted seed
will be defined by the results of other wavefronts or by constants
used in a query.) Graphically, a seed is represented
as an incoming edge to the starting state of the wavefront.
We use the label “U” to denote a universal seed; any other
label on this edge denotes a restriction placed on the seed,
thus a restricted seed.

Given an evaluation plan defined by search wavefronts,
the graph exploration is performed by an iterative procedure
as illustrated in Fig. 6. For example, consider WGP
P1 that uses a single search wavefront to answer query
Q = (x, (ab)+, y) on graph G as shown in Fig. 7. Let the
wavefront W1 be constructed by a direct mapping of the
query's regular expression into an NFA.

During the search, intermediate results are kept in a cache,
denoted at iteration i by $C_i$. This is a collection of tuples
(u, v, s) for which u and v are nodes in G and s is a state
in W1. The newly discovered tuples found in the current
iteration are denoted by a delta $\Delta_i$. We use the cache $C_i$
and the delta $\Delta_i$ to eliminate intermediate answers we have
already seen in the search.

WaveguideSearch(G, A_Q)
    Δ^r_0 ← seed(G);
    i ← 0;
    while |Δ^r_i| > 0 do
        Δ_{i+1} ← seed(Δ^r_i);
        Δ_{i+1} ← crank(Δ_{i+1}, Δ^r_i, G, C_i, A_Q);
        Δ^r_{i+1} ← reduce(Δ_{i+1}, Δ^r_i, C_i);
        C_{i+1} ← cache(Δ^r_{i+1}, C_i);
        i ← i + 1;
    done;
    return extract(C_i);

Figure 6: Waveguide evaluation procedure (Δ^r denotes a reduced delta).

In the first step of the search procedure, all the universal
seeds are initialized. Specifically, $\Delta_0$ is assigned the set
of tuples $(u, u, q_0)$ for all $u \in N$;^5 $q_0$ is the starting state for all
wavefront automata with universal seeds.

Next, we loop over iterative steps. In each iteration, four
operations are performed: seed, crank, reduce, and cache. The
iteration continues until fixpoint is reached.

The seed step populates the restricted seeds, according to
their respective seed conditions. The crank step transitions
from the previous delta to the current, $\Delta_i \rightarrow \Delta_{i+1}$. For
each tuple $(u, v, s) \in \Delta_i$, for each edge $(v, a, w)$ in G and
graph transition $(s, a, t) \in W$, the tuple $(u, w, t)$ is added to $\Delta_{i+1}$.
Thus crank advances the search simultaneously in the graph
and in the automaton.

To prevent unbounded computation over cyclic graphs,
the delta is reduced: $\Delta_{i+1}$ is checked against both the previous
delta $\Delta_i$ and the cache $C_i$; tuples that are seen in
either $\Delta_i$ or $C_i$ are removed to produce the reduced delta. Lastly, the
cache is updated to $C_{i+1}$ by adding the tuples in the reduced
delta to $C_i$. The iteration halts once the reduced delta is empty.

Recall that W1 in this example was produced by directly mapping
the regular expression r to an NFA. As the NFA is a
recognizer for r, it can be established by structural induction
that, for any tuple (u, v, s) in the cache C such that
s is an accepting state, the pair of nodes (u, v) must have a
path between them in the graph that conforms to r. Thus,
Waveguide produces the correct results. The answer set
can then be extracted from the cache by selecting the tuples
(u, v, s) for all accepting states s of automaton W1.

Example 5. Consider the wavefront search in Fig. 7 for a
query with regular expression r = (ab)+ on graph G. Plan
P1 uses a single wavefront W1 which is a basic wavefront
embodying an NFA that recognizes r.

For each iteration i of the search, the cache $C_i$, the delta $\Delta_i$, and
the reduced delta are shown. The search stops when all
newly generated tuples are, in fact, duplicates, due to cycles
in G. The cache tuples that are in accepting state $q_2$ (shown
shaded) are then extracted as the answer set.
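
The iterative procedure for a single wavefront with a universal seed can be sketched in a few lines. This is a simplified illustration under stated assumptions, not Waveguide's implementation: there are no restricted seeds (so the seed step is omitted), all transitions are append transitions over graph labels, the seed tuples are placed in the cache up front, and the function name, state names, and four-node cyclic graph are hypothetical.

```python
def waveguide_search(nodes, edges, transitions, start, accepting):
    """Single-wavefront search with a universal seed.

    edges:       set of (v, a, w) graph triples
    transitions: set of (s, a, t) automaton transitions (append only)
    A tuple (u, v, s) records that some path from u to v has driven the
    automaton from `start` to state s.
    """
    out = {}                                   # adjacency index: v -> [(a, w)]
    for v, a, w in edges:
        out.setdefault(v, []).append((a, w))

    delta = {(u, u, start) for u in nodes}     # universal seed
    cache = set(delta)                         # assumption: seeds are cached
    while delta:
        # crank: advance simultaneously in the graph and in the automaton
        cranked = {(u, w, t)
                   for (u, v, s) in delta
                   for (a, w) in out.get(v, ())
                   for (s2, a2, t) in transitions
                   if s2 == s and a2 == a}
        # reduce: drop tuples already seen, which bounds work on cycles
        delta = cranked - cache
        # cache: remember the newly discovered tuples
        cache |= delta
    # extract: answers are the node pairs held in an accepting state
    return {(u, v) for (u, v, s) in cache if s in accepting}

# Hypothetical cyclic graph 1 -a-> 2 -b-> 3 -a-> 4 -b-> 1 and an NFA for (ab)+.
nodes = {1, 2, 3, 4}
edges = {(1, "a", 2), (2, "b", 3), (3, "a", 4), (4, "b", 1)}
nfa = {("q0", "a", "q1"), ("q1", "b", "q2"), ("q2", "a", "q1")}
answers = waveguide_search(nodes, edges, nfa, "q0", {"q2"})
print(sorted(answers))   # [(1, 1), (1, 3), (3, 1), (3, 3)]
```

On this graph the search terminates only because of the reduce step: once the wavefront has gone around the cycle, every cranked tuple is already in the cache and the delta becomes empty.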

^5 This can be optimized to pull just the tuples from the triple
store that can participate in the first step of any path to an
answer. We call this first-hop optimization.

Figure 8: Types of transitions used in a wavefront. Panel (b): expanding the wavefront by appending, using the tuples
from G (over the graph) vs. from the search cache C (over the view).

3.3 Guiding a Wavefront


The construction of the NFA forces an order to the query
evaluation. A “wrong” choice of NFA can lead to an inefficient
evaluation plan. In Waveguide, we aim to minimize
the search space explored by considering the possible orders
of graph exploration by search wavefronts. To achieve this,
we use wavefront automata which can use transitions that
expand the wavefront in the direction opposite to the direction
of the edges of the graph.

Consider a graph transition (s, l, t) in wavefront W. The edge
label l has the general form ·a or a·, where a is an edge label in
G. The position of the dot · specifies the direction of the search
wavefront and denotes a prepend (·a) or an append (a·) transition.
A prepend transition expands the wavefront in the direction opposite
to the edges in the graph. An append transition, on the other hand,
expands the wavefront in the same direction as
the edges in the graph. The semantics of the crank operation on
prepend and append transitions are illustrated in Fig. 8a.

Hence, wavefronts enable automaton transitions that explore
the graph in a direction specified by the transition.
This allows us to define a wavefront that can initiate evaluation
from any label in the given regular expression and
iteratively expand by appending or prepending path labels.
This gives us the power to explore all different expansion
orders of a single wavefront.
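
The effect of the two transition types on the crank step can be sketched as follows. The encoding of a transition as a 4-tuple `(s, direction, a, t)` and the function name `crank` are assumptions made for this illustration; the example evaluates the path a/b by starting at the b edges and then prepending a.

```python
def crank(delta, edges, transitions):
    """One crank step with direction-aware transitions.

    delta:       set of (u, v, s) tuples: a path from u to v held in state s
    edges:       set of (x, label, y) graph triples
    transitions: set of (s, direction, a, t) with direction "append" or "prepend"
    """
    new = set()
    for (u, v, s) in delta:
        for (s2, direction, a, t) in transitions:
            if s2 != s:
                continue
            for (x, label, y) in edges:
                if label != a:
                    continue
                if direction == "append" and x == v:
                    new.add((u, y, t))   # extend at the tail: u ~> v -a-> y
                elif direction == "prepend" and y == u:
                    new.add((x, v, t))   # extend at the head: x -a-> u ~> v
    return new

# Hypothetical graph 1 -a-> 2 -b-> 3; evaluate a/b starting from the b edges.
edges = {(1, "a", 2), (2, "b", 3)}
plan = {("q0", "append", "b", "q1"), ("q1", "prepend", "a", "q2")}
seed = {(n, n, "q0") for n in (1, 2, 3)}
step1 = crank(seed, edges, plan)
step2 = crank(step1, edges, plan)
print(step1)   # {(2, 3, 'q1')}
print(step2)   # {(1, 3, 'q2')}
```

Starting from the more selective label and growing the path in both directions is one way a plan can shrink the intermediate deltas; which label is more selective depends on the data.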

3.4 Wavefront Interaction 

Often, the search space is constrained even further if several
wavefronts are employed in the evaluation, each evaluating
parts of a given regular expression. Waveguide enables
this by defining a number of automata, one for each search
wavefront.

Waveguide plans, in addition to transitions over graph
edge labels, allow transitions over path views, by utilizing
cached result sets produced by other wavefronts. Consider
a transition (s, l, t) in W1, where the label l is built over a symbol a.
If a is an edge label in G, then
this graph transition expands the wavefront by using the tuples
from graph G. Otherwise, if a is a state in W2, then
this view transition expands the wavefront by employing the
tuples produced by wavefront W2 (as illustrated in Fig. 8b).

These new types of transitions offer powerful choices in
WGPs for guiding the search. The search can have multiple
wavefronts originating from different starting points and expanding
in different directions. Further, each wavefront can
employ the cache through transitions over views to avoid
unnecessary recomputation.^6

Example 6. Consider the wavefront search in Fig. 7 for
a query Q with regular expression r = (ab)+. P1 is a basic
WGP embodying an NFA that recognizes r. From P1, we
can design a more efficient WGP, P2: first, compute (ab)
with wavefront W1; then use a loop-back view transition
to compute the closure (ab)+ (with wavefront W2). In this
case, it can be shown that P2 explores a smaller search space
in fewer iterations than P1.
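
A plan in the spirit of P2 can be sketched as two phases: one wavefront materializes the path view for ab, and a second wavefront, seeded with that view, loops back over it. The function name `plan_p2`, the dictionary indexes, and the cyclic test graph are assumptions of this sketch; it does not reproduce the plan of Fig. 7.

```python
def plan_p2(edges):
    """Evaluate (ab)+ using a path view for ab and a loop-back over the view."""
    a_out, b_out = {}, {}
    for (x, label, y) in edges:
        if label == "a":
            a_out.setdefault(x, set()).add(y)
        elif label == "b":
            b_out.setdefault(x, set()).add(y)

    # First wavefront: materialize the view ab as a set of (u, w) pairs.
    view = {(u, w)
            for u, mids in a_out.items()
            for v in mids
            for w in b_out.get(v, ())}
    view_out = {}
    for (u, w) in view:
        view_out.setdefault(u, set()).add(w)

    # Second wavefront: seeded with the view, expanded by a view transition,
    # so each iteration advances by a whole "ab" step.
    cache, delta = set(view), set(view)
    while delta:
        cranked = {(u, w) for (u, v) in delta for w in view_out.get(v, ())}
        delta = cranked - cache
        cache |= delta
    return cache

# Hypothetical cyclic graph 1 -a-> 2 -b-> 3 -a-> 4 -b-> 1.
edges = {(1, "a", 2), (2, "b", 3), (3, "a", 4), (4, "b", 1)}
print(sorted(plan_p2(edges)))   # [(1, 1), (1, 3), (3, 1), (3, 3)]
```

Because the second wavefront never holds tuples for the intermediate state reached after a lone a edge, its cache contains only pairs that are answers.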

3.5 The Waveguide Plan Space 

We claim that the space of waveguide plans subsumes that
of the FA and α-RA approaches, as the Venn diagram in
Fig. 5 shows (and with the caveats as discussed in §2.4).

Claim 2. $\mathcal{P}_{WGP}$ properly subsumes the union of $\mathcal{P}_{FA}$ and
$\mathcal{P}_{\alpha\text{-RA}}$ ($\mathcal{P}_{WGP} \supsetneq \mathcal{P}_{FA} \cup \mathcal{P}_{\alpha\text{-RA}}$).

That $\mathcal{P}_{WGP}$ subsumes each of $\mathcal{P}_{FA}$ and $\mathcal{P}_{\alpha\text{-RA}}$ is straightforward;
we devised WGP so that we could express both FA-
and α-RA-type plans. WGP extends the FA model. WGP
encompasses α-RA by the addition of views; what the α operator
offers, transitive closure over an arbitrary relation,
can be accomplished by view-labeled transitions in a waveguide
plan.

That P_WGP properly subsumes the union of P_FA and
P_α-RA means that there is a waveguide plan that
corresponds to no FA plan and to no α-RA plan. We have
demonstrated this in the discussions above. Any WGP with
multiple wavefronts and some wavefront with a long
loop-back is such a plan; FA plans are essentially single wavefront
by the FA model, and pipelined loop-backs are outside the
scope of α-RA. Likewise, any WGP, even single wavefront,
that is “mixed”, that combines views and long loop-backs,
corresponds to no FA plan and to no α-RA plan. (In Fig. 12,
P2 with partial loop caching is such a plan.) Of
course, these very types of waveguide plans that FA and
α-RA miss are often the most efficient plans for a given query.

®This is also known as memoization.

In §4, we explain why this rich plan space is relevant. In
§5, we compare plans for real queries over real graph data
to establish that this is true in practice, as well.

4. PLAN COSTS

For a given query, of course, there may be many ways 
to guide the search. We summarize a cost framework for 
Waveguide search, search cost factors that can magnify the 
cost (properties of the graph and of resulting pre-paths com¬ 
puted during evaluation), and optimization methods that are 
enabled by WGPs which address the search factors, in turn. 

4.1 Cost Framework 

Recall the three steps in Fig. 6 of the search iteration:
crank, reduce, and union. Assume that the search completes
in n iterations. The cost of crank, C_crank, corresponds to the
total number of edge walks performed. This search size is the
sum of sizes of the deltas. The cost of reduce, C_reduce, has
two components: duplicate removal within a delta and for
the delta against the search cache. Removal against
the delta is often cheaper, since it can be implemented in
memory, while checking against the cache, due to its larger
size, might require implementation on secondary storage,
therefore increasing its cost.

The cost of union, C_union, is associated with search cache
maintenance procedures (e.g., indexing), and depends on the
size of the search cache.

1. $C_{crank} = \sum_{i=0}^{n} f_1(|\Delta_i|)$

2. $C_{reduce} = \sum_{i=0}^{n} \bigl( f_2(|\Delta_i|) + f_3(|C_i|) \bigr)$

3. $C_{union} = \sum_{i=0}^{n} f_4(|C_i|)$

The cost functions f_1, ..., f_4 above are monotone over their
parameters; these simply abstract the actual costs as based
upon the underlying implementation of Waveguide’s data
structures and algorithms.
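
As a rough illustration of how the three cost components combine, the following Python sketch totals them for a search, given the delta size and the cache size at each iteration. The function name `search_cost`, the sample sizes, and the linear cost functions are assumptions made for this example only; the framework itself requires only that the cost functions be monotone.

```python
from typing import Callable, Dict, Sequence

CostFn = Callable[[int], float]


def search_cost(
    delta_sizes: Sequence[int],
    cache_sizes: Sequence[int],
    f1: CostFn,
    f2: CostFn,
    f3: CostFn,
    f4: CostFn,
) -> Dict[str, float]:
    """Sum crank, reduce, and union costs over all iterations.

    delta_sizes[i] is the delta size at iteration i and cache_sizes[i]
    is the search cache size at iteration i (both hypothetical inputs).
    """
    if len(delta_sizes) != len(cache_sizes):
        raise ValueError("need one delta size and one cache size per iteration")

    # crank: cost driven by the number of edge walks (the delta sizes)
    crank = sum(f1(d) for d in delta_sizes)
    # reduce: duplicate removal within the delta plus against the cache
    reduce_cost = sum(f2(d) + f3(c) for d, c in zip(delta_sizes, cache_sizes))
    # union: cache maintenance, driven by the cache size
    union = sum(f4(c) for c in cache_sizes)

    return {
        "crank": crank,
        "reduce": reduce_cost,
        "union": union,
        "total": crank + reduce_cost + union,
    }


# Made-up sizes and made-up linear cost functions, for illustration only.
EXAMPLE_COSTS = search_cost(
    delta_sizes=[4, 2],
    cache_sizes=[4, 6],
    f1=lambda d: d,        # one unit per edge walk
    f2=lambda d: 0.5 * d,  # cheap in-memory check within the delta
    f3=lambda c: 2 * c,    # more expensive check against the cache
    f4=lambda c: c,        # cache maintenance
)
print(EXAMPLE_COSTS)
```

Swapping in different monotone functions for `f1` through `f4` changes the totals but not the structure of the computation, which is the point of keeping the framework abstract.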

4.2 Search Cost Factors 

Properties of the graph and of the WGP chosen—so the 
guided search as performed in terms of the pre-paths that 
are computed by the search—will determine the evaluation 
cost. 

Search Cardinalities. The wavefront, or wavefronts, that we
choose for the search (as dictated by the WGP) determine
the intermediate results (pairs of nodes
connected by valid pre-paths) that we collect each iteration.
Just as with different join orders in relational query
evaluation, different wavefronts will result in different intermediate
delta sizes. These intermediate cardinalities can vary widely
from plan to plan.

Solution Redundancy. After much deliberation in the
research community, the W3C has adopted a non-counting
semantics for SPARQL property-path queries. Each node
pair appears at most once in the answer, even if there are 
several paths between the node pair satisfying the given reg¬ 
ular expression. 



Figure 9: Types of search cost factors: (a) search cardinality,
(b) solution redundancy, (c) sub-path sharing.

Answer-path redundancy arises from two sources. First, 
in dense graphs, solutions are re-discovered by following con¬ 
forming, yet different paths. Second, nodes are revisited by 
following cycles in the graph. Thus, the same answer pair 
may be discovered repeatedly during evaluation. It is criti¬ 
cal to detect such duplicate solutions early in order to keep 
the search size and search cache small. 

Sub-path Redundancy. In solution redundancy, an answer
pair could have multiple paths justifying it. Likewise, the
paths justifying multiple answer pairs may share significant
segments (sub-paths) in common.

This arises, for instance, in dense graphs and with
hierarchical structures (e.g., isA and locatedIn edge labels).
Consider a query “?p :locatedIn+ Canada”. Every person located
in the neighborhood of the Annex in the city of Toronto
qualifies, since the Annex is located in Toronto which is
located in Ontario which is located in Canada. The sub-path
“Annex :locatedIn+ Canada” is shared by the answer path
for each Annex resident.

Because we keep only node-pairs (plus state) in the search
deltas, and not explicitly the paths themselves,^ we may
walk these sub-paths many times, recomputing
“Annex :locatedIn+ Canada” for each Annex resident.

4.3 Plan Optimizations 

We consider WGP-optimization methods in relation to the 
search cost factors above. 

Choice of Wavefronts. The direction in which we follow
edges, and where we start in the graph, with respect to
the regular expression will result in different search
cardinalities. Our choice of automata in the WGP dictates the
wavefront(s). For example, consider query Q = (x, (abc), y)
and a fragment of a graph shown in Fig. 9a. Since labels
a, b and c have different cardinalities, different wavefronts
will have different search sizes. Consider two plans P1 and
P2 that evaluate Q, shown in Fig. 10. P1 has a single
wavefront that explores the graph starting from a, appending b
and then c. On the other hand, P2 has a wavefront that
starts from the low cardinality label b, appends c and then
prepends a. Observe that, in this scenario, P2 results in
fewer edge walks than P1.

To reduce overall search size, we need to choose wavefronts
that result in fewer edge walks. Wavefronts can be costed
to estimate their search sizes based on statistics about the
graph, such as 1-gram and 2-gram label frequencies. (Such
graph statistics can be computed offline for this purpose.)
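
The effect of wavefront choice can be reproduced on a toy graph. The sketch below is not Waveguide's implementation: it counts edge walks for the two strategies described above (start from a and append b and c, versus start from b, append c, and prepend a) over a made-up set of triples in which label a is frequent and label b is rare. All function names and data are hypothetical.

```python
from collections import defaultdict


def build_indexes(triples):
    """Index edges by label, by (source, label), and by (target, label)."""
    by_label = defaultdict(list)
    out_idx = defaultdict(list)
    in_idx = defaultdict(list)
    for s, label, o in triples:
        by_label[label].append((s, o))
        out_idx[(s, label)].append(o)
        in_idx[(o, label)].append(s)
    return by_label, out_idx, in_idx


def seed(by_label, label):
    """Start a wavefront with every edge of `label` (one walk per edge)."""
    edges = by_label.get(label, [])
    return set(edges), len(edges)


def append(pairs, out_idx, label):
    """Extend each (x, y) pair forward over a `label` edge leaving y."""
    result, walks = set(), 0
    for x, y in pairs:
        for z in out_idx.get((y, label), ()):
            walks += 1
            result.add((x, z))
    return result, walks


def prepend(pairs, in_idx, label):
    """Extend each (y, z) pair backward over a `label` edge entering y."""
    result, walks = set(), 0
    for y, z in pairs:
        for x in in_idx.get((y, label), ()):
            walks += 1
            result.add((x, z))
    return result, walks


def plan_start_from_a(triples):
    """Single wavefront: start from a, append b, then append c."""
    by_label, out_idx, _ = build_indexes(triples)
    pairs, w1 = seed(by_label, "a")
    pairs, w2 = append(pairs, out_idx, "b")
    pairs, w3 = append(pairs, out_idx, "c")
    return pairs, w1 + w2 + w3


def plan_start_from_b(triples):
    """Single wavefront: start from b, append c, then prepend a."""
    by_label, out_idx, in_idx = build_indexes(triples)
    pairs, w1 = seed(by_label, "b")
    pairs, w2 = append(pairs, out_idx, "c")
    pairs, w3 = prepend(pairs, in_idx, "a")
    return pairs, w1 + w2 + w3


# Toy data (made up): four a-edges, one b-edge, one c-edge.
TRIPLES = [
    ("s1", "a", "m1"), ("s2", "a", "m1"), ("s3", "a", "m2"), ("s4", "a", "m3"),
    ("m1", "b", "n1"),
    ("n1", "c", "t1"),
]

answers_1, walks_1 = plan_start_from_a(TRIPLES)
answers_2, walks_2 = plan_start_from_b(TRIPLES)
print(walks_1, walks_2, answers_1 == answers_2)
```

On this toy data both plans return the same two answer pairs, while the first performs 8 edge walks and the second only 4, because starting at the rare label avoids walking a-edges that lead nowhere.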

Reduce. Waveguide’s evaluation strategy is designed to
counter solution redundancy. As shown in Fig. 9b, we
consider several types of redundant solutions based on a path
which was followed to obtain each solution. Path A is a
shortest path. Paths B and B' are of the same length, but
go through different nodes. Finally, path C' shares some
nodes with path C, but it is longer due to a cycle.

^Note this design choice in our evaluation strategy is critical
for good performance due to solution redundancy!

Figure 11: Threading a shared sub-path.

Figure 12: Types of loop caching.

In Waveguide, redundancy of candidate solutions is
addressed by removal of duplicates against both cache (cache)
and delta (delta) by the reduce operation. Assuming a BFS
search strategy, duplicate solutions obtained by following
paths of the same length (B, B') are removed within a delta.
On the other hand, duplicates obtained by following paths
of different lengths (A, B) are removed when the delta is
compared against the cache. This also includes paths with cycles
such as C and C', as they also have different lengths.

As a further optimization, once a solution seed-target pair
has been discovered, first-path pruning (fpp) removes the
seed from further expansion by the search wavefronts. In
our example, once path A has been discovered and solution
(x, y) has been obtained, all longer paths (B, B', C, C') are
never even materialized.
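
The three kinds of pruning can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the Waveguide implementation: it evaluates a closure over a single unlabeled edge relation breadth-first, keeps only (seed, node) pairs, and counts delta pruning, cache pruning, and first-path pruning separately. The function `closure_search`, the statistics keys, and the toy edges are all hypothetical.

```python
from collections import defaultdict


def closure_search(edges, seeds, target=None):
    """Breadth-first closure search over (seed, node) pairs.

    If `target` is given, first-path pruning (fpp) retires a seed as soon
    as the pair (seed, target) has been found.
    """
    succ = defaultdict(list)
    for s, o in edges:
        succ[s].append(o)

    cache = set()                        # all pairs discovered so far
    frontier = {(s, s) for s in seeds}   # pairs to expand; not answers themselves
    retired = set()                      # seeds removed by fpp
    stats = {"walks": 0, "delta_pruned": 0, "cache_pruned": 0, "fpp_pruned": 0}

    while frontier:
        # crank: walk one edge from every frontier pair
        candidates = []
        for seed, node in frontier:
            if seed in retired:
                stats["fpp_pruned"] += 1
                continue
            for nxt in succ.get(node, ()):
                stats["walks"] += 1
                candidates.append((seed, nxt))

        # reduce (1): duplicates within the delta (paths of equal length)
        delta = set(candidates)
        stats["delta_pruned"] += len(candidates) - len(delta)

        # reduce (2): duplicates against the cache (paths of different length)
        fresh = delta - cache
        stats["cache_pruned"] += len(delta) - len(fresh)

        # union: add the surviving pairs to the cache
        cache |= fresh
        if target is not None:
            retired |= {seed for seed, node in fresh if node == target}
        frontier = fresh

    return cache, stats


# Toy graph (made up): a direct edge x->y, two length-2 paths via p and q,
# and a cycle y->p->y.
EDGES = [("x", "y"), ("x", "p"), ("x", "q"), ("p", "y"), ("q", "y"), ("y", "p")]

print(closure_search(EDGES, {"x"}))
print(closure_search(EDGES, {"x"}, target="y"))
```

Without a target, the second iteration rediscovers (x, y) twice via p and q (one duplicate is removed within the delta) and the remaining candidates are removed against the cache. With target y, the seed x is retired after the first iteration, so the longer paths are never walked.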

Threading. To counter sub-path redundancy requires us to
decompose a query into sub-queries. We call this
decomposition threading, and our WGPs accommodate this.

Consider query Q = (x, (r1/rs/r2), y) where sub-path rs
is shared among many solutions as shown in Fig. 9c. This
query can be threaded as follows. First, pre-path r1 is
computed by wavefront W_r1. Then, the portion of the regular
expression that will result in sub-paths that will be shared
by many answer paths can be computed by a separate
wavefront W_rs. Here, W_r1 seeds wavefront W_rs, which computes
a shared path for each of the partial solutions produced by
W_r1. Finally, the complete path is pieced together by
wavefront W_join.

Such sub-path sharing can be predicted by graph statistics 
to indicate when sub-queries should be considered. 
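
The saving from threading can be seen on a small, made-up version of the Annex example. The sketch below is an illustration only: it compares a direct single-wavefront evaluation of a hypothetical query `livesIn/locatedIn+` with a threaded evaluation that computes the shared `locatedIn+` sub-path once per distinct starting node and then joins. The label `livesIn`, the function names, and the data are assumptions.

```python
from collections import defaultdict


def successors(triples, label):
    """Map each node to its successors over edges with the given label."""
    succ = defaultdict(list)
    for s, lab, o in triples:
        if lab == label:
            succ[s].append(o)
    return succ


def closure_from(frontier, succ):
    """Breadth-first closure from (start, node) pairs; returns (pairs, walks)."""
    found, walks = set(), 0
    while frontier:
        fresh = set()
        for start, node in frontier:
            for nxt in succ.get(node, ()):
                walks += 1
                if (start, nxt) not in found:
                    fresh.add((start, nxt))
        found |= fresh
        frontier = fresh
    return found, walks


def direct_plan(triples):
    """One wavefront: walk livesIn, then locatedIn+ separately per person."""
    lives = successors(triples, "livesIn")
    located = successors(triples, "locatedIn")
    partial = {(p, h) for p, homes in lives.items() for h in homes}
    walks = sum(len(homes) for homes in lives.values())
    answers, closure_walks = closure_from(partial, located)
    return answers, walks + closure_walks


def threaded_plan(triples):
    """Three wavefronts: r1, the shared sub-path rs, and a final join."""
    lives = successors(triples, "livesIn")
    located = successors(triples, "locatedIn")
    # first wavefront: partial solutions (person, home)
    partial = {(p, h) for p, homes in lives.items() for h in homes}
    walks = sum(len(homes) for homes in lives.values())
    # second wavefront: shared sub-path, seeded by the distinct homes only
    homes = {h for _, h in partial}
    shared, shared_walks = closure_from({(h, h) for h in homes}, located)
    # third wavefront: join partial solutions with the shared sub-paths
    reach = defaultdict(set)
    for h, place in shared:
        reach[h].add(place)
    join_outputs = sum(len(reach.get(h, ())) for _, h in partial)
    answers = {(p, place) for p, h in partial for place in reach.get(h, ())}
    return answers, walks + shared_walks, join_outputs


# Toy data (made up): five Annex residents and a three-step location chain.
TRIPLES = [(f"resident{i}", "livesIn", "Annex") for i in range(1, 6)] + [
    ("Annex", "locatedIn", "Toronto"),
    ("Toronto", "locatedIn", "Ontario"),
    ("Ontario", "locatedIn", "Canada"),
]

direct_answers, direct_walks = direct_plan(TRIPLES)
threaded_answers, threaded_walks, join_outputs = threaded_plan(TRIPLES)
print(direct_walks, threaded_walks, join_outputs)
```

Here the direct plan performs 20 edge walks and the threaded plan 8, at the price of a join that produces 15 tuples. Whether that trade is worthwhile depends on how expensive the join is relative to an edge walk, which the sketch does not model.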

Partial Caching. Delta results are cached during evaluation 
as we need to check against the cache for redundantly com¬ 
puted pairs. For large intermediate cardinalities, this can be 
a significant cost. However, some of this cost can be negated. 
In particular, not every state in the WGP’s automata needs 
to have its node-pairs cached. Caching is only needed when
unbounded redundancy is possible, due to cycles in the wave- 
front automata or in the graph. States without cycles need 
not be cached. 

Loop Caching. Transitions over views in wavefront automata
allow us to cache and re-use some of the intermediate node
pairs we encounter during the search. Such named result
sets are useful in reducing unnecessary re-computation by
employing an optimization we call loop caching.

In transitive query Q = (x, (r)+, y), the expression r is
evaluated repeatedly until no new solutions are found. Loop
caching rewrites an evaluation plan such that the base r is
cached either fully or partially to speed up the transitive
evaluation of (r)+.

Consider three plans Pnc, Ppc and Pfc for query Q =
(x, (abc)+, y) shown in Fig. 12. Plan Pnc has no loop caching
as it evaluates the full expression (abc) in a loop. Plan Ppc uses
a separate wavefront to evaluate (bc) first, then these results
are used in a loop to evaluate the transitive (abc)+. Finally, plan
Pfc caches the full base expression (abc), which is then used in
the evaluation of the transitive expression.
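
The difference between evaluating (abc)+ without caching and with full loop caching can be sketched over plain Python sets of node pairs. This is an illustration under simplifying assumptions (relations held in memory, no wavefront automata), and the names `compose`, `plan_nc`, and `plan_fc` are hypothetical. It counts how many concatenations each plan performs.

```python
from collections import defaultdict


def compose(left, right):
    """Concatenate two relations: {(x, z) | (x, y) in left, (y, z) in right}."""
    by_src = defaultdict(set)
    for y, z in right:
        by_src[y].add(z)
    return {(x, z) for x, y in left for z in by_src.get(y, ())}


def plan_nc(a, b, c):
    """No loop caching: every pass of the loop walks a, then b, then c."""
    answers = set()
    delta = compose(compose(a, b), c)
    concatenations = 2
    while delta:
        answers |= delta
        step = compose(compose(compose(delta, a), b), c)
        concatenations += 3
        delta = step - answers
    return answers, concatenations


def plan_fc(a, b, c):
    """Full loop caching: compute (abc) once, then loop over the cached base."""
    base = compose(compose(a, b), c)
    concatenations = 2
    answers = set()
    delta = set(base)
    while delta:
        answers |= delta
        delta = compose(delta, base) - answers
        concatenations += 1
    return answers, concatenations


# Toy chain (made up): 1 -a-> 2 -b-> 3 -c-> 4 -a-> 5 -b-> 6 -c-> 7
A = {(1, 2), (4, 5)}
B = {(2, 3), (5, 6)}
C = {(3, 4), (6, 7)}

nc_answers, nc_concats = plan_nc(A, B, C)
fc_answers, fc_concats = plan_fc(A, B, C)
print(sorted(nc_answers), nc_concats, fc_concats)
```

Both plans produce the same three answer pairs on this chain, with 8 concatenations without caching and 4 with full caching. The count alone does not decide which plan is cheaper, since each concatenation operates over relations of different sizes.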

4.4 Cost Analysis 

In this section we analyze the cost of plan optimizations
that are exclusive to the Waveguide approach, such as
threading and loop caching, in relation to the cost model
presented in Section 4.1.

Costs of threading. Given a plan P with a single wavefront
which computes a regular expression of the form r = r1/rs/r2,
threading rewrites it into a plan Pt with three wavefronts
W_r1, W_rs and W_join as described in Section 4.3. Regardless
of the split of r into r1, rs and r2, this optimization requires
an additional cost of an extra join in W_join. If a shared
sub-path rs is accurately identified, then the total reduction in
the number of edge walks in Pt is sufficiently large to offset the
cost of the extra join.

A useful graph metric in identifying the threading split is
the multiplicity ratio of an expression r in graph G, which is
computed by analyzing the paths in G:

$\mathcal{M}(G, r) = \frac{|S_s|}{|S_o|}$,

where S_s and S_o are the sets of subjects and objects,
respectively, connected in G with paths conforming to r. Then,
M(G, r) > 1 would indicate that, on average, there are
many subjects connected to a single object in G, while
M(G, r) < 1 would indicate that the opposite is true. The
greater M is, the more subjects are connected to the same
object, and, hence, more subjects share a path which
originates from this object.

Graph G1 (edge walks / pruned tuples per search step):

       step 1    step 2    step 3      step 4    total
Pnc    12 / 0    12 / 10   6 / 0       6 / 5     36 / 15
Pfc    12 / 0    12 / 10   1 / 0       -         25 / 10

Graph G2 (edge walks / pruned tuples per search step):

       step 1    step 2    step 3      step 4    total
Pnc    12 / 0    72 / 0    36 / 30     36 / 0    156 / 30
Pfc    12 / 0    72 / 0    216 / 180   -         300 / 180

Figure 13: Lensing.

Figure 14: Overview of a prototype system


Another useful metric is the average length L(G, rs) of a
path which conforms to rs in G. The longer the shared
sub-path rs is, the more potential savings in edge walks can be
realized by a threading split on rs.

Then, given M and L for sub-expressions of r in G, the
identification of an efficient threading split r = r1/rs/r2
becomes an (M, L) maximization problem.
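
For the simplest case, the multiplicity ratio can be computed directly from the triples. The sketch below assumes, following the description above, that the ratio is the number of distinct subjects divided by the number of distinct objects connected by the expression, and it further restricts the expression to a single edge label (paths of length one). The function name, the label `livesIn`, and the data are hypothetical.

```python
def multiplicity_ratio(triples, label):
    """Ratio of distinct subjects to distinct objects for a single edge label.

    Assumption: the expression r is one label, so the subject and object
    sets can be read off the matching triples without any path search.
    """
    subjects = {s for s, lab, o in triples if lab == label}
    objects = {o for s, lab, o in triples if lab == label}
    if not objects:
        raise ValueError(f"no edges with label {label!r}")
    return len(subjects) / len(objects)


# Toy data (made up): many residents share one neighborhood.
TRIPLES = [(f"resident{i}", "livesIn", "Annex") for i in range(1, 6)] + [
    ("Annex", "locatedIn", "Toronto"),
    ("Toronto", "locatedIn", "Ontario"),
    ("Ontario", "locatedIn", "Canada"),
]

print(multiplicity_ratio(TRIPLES, "livesIn"))    # 5 subjects / 1 object
print(multiplicity_ratio(TRIPLES, "locatedIn"))  # 3 subjects / 3 objects
```

A ratio well above 1, as for `livesIn` here, suggests that many partial solutions funnel into few nodes, so whatever follows is a candidate for a shared, threaded sub-path. For composite expressions the subject and object sets would have to come from evaluating or estimating the paths, which this sketch does not attempt.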

Costs of loop caching. Given a plan with a single wavefront
which computes the closure (r)+ of a regular expression r, loop
caching rewrites it into a plan in which parts of r are
pre-computed, cached, and then used in an iterative evaluation
of the closure. For example, consider the differences in
evaluation of query Q = (x, (abc)+, y) with plans Pnc, Ppc and
Pfc shown in Fig. 12. Pnc defines a single wavefront, which,
due to the absence of transitions over views, can be executed
pipelined. On the other hand, Ppc and Pfc first compute
(bc) and (abc), respectively, in separate wavefronts, the
results of which are used in a wavefront which computes the
final closure. Note that due to shorter cycles in the wavefront
automata in cached plans Ppc and Pfc, the total number of
concatenations performed is smaller than in Pnc. However,
the cost of each concatenation is different due to different
sizes of the participating relations. For example, Pnc
concatenates intermediate paths with a, b and then c, while Pfc
does the same with a single concatenation with cached C_abc.
In fact, depending on the cardinalities |C_a|, |C_b|, |C_c|, |C_bc|
and |C_abc|, the concatenations performed in any of the above
plans might become the preferred cheaper alternative.

Further, the number of pruned tuples in plans with or
without caching can differ significantly depending on the
general shape of the graph. For example, consider two basic
graphs G1 and G2 as presented in Fig. 13. Both G1 and
G2 have the same frequencies of labels a and b, but
differ in shape. G1 exhibits lensing with
focal points on concatenations b/a, while G2 has lensing
in a/b. Intermediate cardinalities (the number of edge
walks and the number of pruned tuples) of the Waveguide
search are presented for plans with (Pfc) and without (Pnc)
loop caching. Observe that the loop caching optimization is
beneficial for search in G1, with 30% and 33% fewer edge walks
and pruned tuples, respectively. On the other hand, loop
caching performs worse in G2, with 92% more edge walks
(300 vs. 156) and six times as many pruned tuples (180 vs. 30).
This can be explained
by analyzing the edge walks and pruned tuples during the
concatenation sequence (.../a/b) which is performed in Pnc,
but not in Pfc. In G1, (.../a/b) computes a large number of
intermediate tuples, most of which are later pruned due to a
focal point in b/a. Meanwhile, in G2, (.../a/b) first prunes
many tuples due to a focal point in a/b, hence reducing the
total number of edge walks performed later in the search.
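
The relative differences can be checked against the totals reported with Fig. 13 (36 edge walks and 15 pruned tuples for Pnc versus 25 and 10 for Pfc in G1; 156 and 30 versus 300 and 180 in G2). The short Python sketch below only redoes this arithmetic; the helper name `percent_change` is an assumption.

```python
def percent_change(new, old):
    """Relative change of `new` with respect to `old`, in percent."""
    return 100.0 * (new - old) / old


# (edge walks, pruned tuples) totals as reported with Fig. 13
TOTALS = {
    "G1": {"Pnc": (36, 15), "Pfc": (25, 10)},
    "G2": {"Pnc": (156, 30), "Pfc": (300, 180)},
}

for graph, plans in TOTALS.items():
    walks = percent_change(plans["Pfc"][0], plans["Pnc"][0])
    pruned = percent_change(plans["Pfc"][1], plans["Pnc"][1])
    print(f"{graph}: edge walks {walks:+.1f}%, pruned tuples {pruned:+.1f}%")
```

For G1 this gives roughly 31% fewer edge walks and 33% fewer pruned tuples with loop caching. For G2 it gives about 92% more edge walks, and the pruned tuples grow from 30 to 180, a sixfold total (an increase of 500%).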

Lastly, we consider queries with constants. In pipelined
plan Pnc, this constant can be pushed to seed condition S
of its wavefront. In fact, the full concatenation (abc) might not
ever need to be computed in Pnc. On the other hand, plans
Ppc and Pfc allow at most partial constant pushdown, since
cached relations must be computed with a universal seed to
ensure completeness of the final closure.

5. PERFORMANCE STUDY

5.1 The Waveguide Prototype 

We have prototyped a Waveguide system that
implements the methodology from §3 in order to benchmark
waveguide plans to study their performance. In this Waveguide
system, resource-intensive tasks are delegated to
PostgreSQL via SQL and procedural SQL routines. This
implementation of our methodology provides high performance,
scalability, and rapid deployment.

Fig. 14 shows the architecture. It consists of two layers:
application and RDBMS. The application layer provides a
user front-end and is responsible for preprocessing the graph
data, parsing user queries, generating WGPs, and visualizing
key steps during the search. The RDBMS layer is responsible
for postprocessing the graph data and performing the iterative
Waveguide graph search for the given WGP.
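
To give a feel for how an iterative search can be delegated to an RDBMS, the following sketch runs a crank/reduce/union loop for a single-label closure as SQL statements. It is only an illustration under loud assumptions: it uses Python's built-in `sqlite3` module as a stand-in for PostgreSQL, and the table names (`edges`, `cache`, `delta`, `next_delta`) and the function `closure_via_sql` are hypothetical, not the prototype's schema or routines.

```python
import sqlite3


def closure_via_sql(triples, label):
    """Compute (x, label+, y) with one SQL round per search iteration."""
    con = sqlite3.connect(":memory:")
    con.executescript(
        """
        CREATE TABLE edges(s TEXT, p TEXT, o TEXT);
        CREATE TABLE cache(seed TEXT, node TEXT, PRIMARY KEY (seed, node));
        CREATE TABLE delta(seed TEXT, node TEXT);
        CREATE TABLE next_delta(seed TEXT, node TEXT);
        """
    )
    con.executemany("INSERT INTO edges VALUES (?, ?, ?)", triples)

    # Iteration 0: every edge with the label is a first pre-path.
    con.execute(
        "INSERT INTO delta SELECT DISTINCT s, o FROM edges WHERE p = ?", (label,)
    )
    con.execute("INSERT INTO cache SELECT seed, node FROM delta")

    while True:
        con.execute("DELETE FROM next_delta")
        # crank: join the delta with the edges;
        # reduce: DISTINCT removes duplicates within the new delta and
        # NOT EXISTS removes pairs already present in the cache.
        con.execute(
            """
            INSERT INTO next_delta
            SELECT DISTINCT d.seed, e.o
            FROM delta AS d
            JOIN edges AS e ON e.s = d.node AND e.p = ?
            WHERE NOT EXISTS (
                SELECT 1 FROM cache AS c
                WHERE c.seed = d.seed AND c.node = e.o
            )
            """,
            (label,),
        )
        new_pairs = con.execute("SELECT COUNT(*) FROM next_delta").fetchone()[0]
        if new_pairs == 0:
            break
        # union: add the new pairs to the cache and make them the next delta
        con.execute("INSERT INTO cache SELECT seed, node FROM next_delta")
        con.execute("DELETE FROM delta")
        con.execute("INSERT INTO delta SELECT seed, node FROM next_delta")

    rows = con.execute("SELECT seed, node FROM cache ORDER BY seed, node").fetchall()
    con.close()
    return rows


# Toy data (made up)
TRIPLES = [
    ("Annex", "locatedIn", "Toronto"),
    ("Toronto", "locatedIn", "Ontario"),
    ("Ontario", "locatedIn", "Canada"),
    ("Alice", "livesIn", "Annex"),
]

for row in closure_via_sql(TRIPLES, "locatedIn"):
    print(row)
```

In a server-side setting the loop itself would typically live in a procedural SQL routine rather than in client code, so that the deltas never leave the database; the sketch keeps the loop in Python only to stay self-contained.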

5.2 Methodology 

We test our implementation of Waveguide by running a
collection of realistic path queries over real-world datasets
YAGO2s [20] and DBPedia [7]. The datasets were
preprocessed by removing invalid and duplicate triples and
self-loops. After preprocessing, YAGO2s had 242M triples and
DBPedia had 463M triples, with 104 and 65K distinct
predicates, respectively. This makes these datasets well suited
for benchmarking path queries.

At the time of this paper, we could not find any available 
benchmarks for SPARQL property-path queries. We there¬ 
fore generate path queries based on data patterns we iden¬ 
tified in real-world graphs. The goal of these experiments 
is to verify the gains offered by Waveguide optimizations, 
and show that they correspond to the cost framework (§4.1) 
and analysis (§4.4). 

Our benchmark was executed on a 2xXeon E5-2640v2
CPU server with a 7200RPM HDD running Ubuntu Server
12.04 x64 and PostgreSQL 9.3.

5.3 Threading

Figure 15: Estimated cost of intermediate concatenations.

Figure 18: Threading over YAGO2s.

Figure 16: Tuples pruned w/ & w/o loop caching.

We benchmark the threading optimization by executing a
query of the following template pattern

?x p/:locatedIn+/:dealsWith+ ?y    (Q3)

over the YAGO2s dataset, with “p” as a variable predicate.
We chose this template for the following reasons. First, since
Q3 contains the concatenation of two transitive closures, it
is difficult to predict the average length of the paths in the
answer. Second, :locatedIn+ is a popular predicate which
also concatenates with many other predicates, so there are
many candidates for p. Finally, :locatedIn+ has an M value
of 11.27, which makes it a good candidate for a threading
split in Q3.

We group p candidates in two sets: the first (queries L1-5)
having an M value greater than 10; and the second (queries
L6-14) having an M value less than 1. Each of the queries is
executed with three different plans: D, a direct evaluation
with a single wavefront with no threading; T1, which performs a
threading split on predicate p; and T2, which threads on :locatedIn+.

The relative running times for queries L1-14 executed
with plans D (the baseline), T1, and T2 are presented in
Fig. 18. As anticipated, the evaluation of queries in the
first group is significantly (up to 75%) faster threaded than
direct, with T1 being slightly faster than T2. This can be
attributed to the fact that the length of the shared path L is shorter
in T2 due to a “later” threading split in the query expression.
Also as anticipated, queries in the second group show bad
results for T1. Indeed, picking a predicate with M < 1 for a
threading split will generally be bad due to few shared paths.
On the other hand, the results for T2 are better than D for
5 out of 9 queries in this second group. This is explained
by the lensing effect, which is produced by concatenation
p/:locatedIn+, while M(G, p) < 1 and M(G, :locatedIn+) >
10. Depending on whether M(G, p/:locatedIn+) is greater
than or less than 1, threading is either desirable or not,
respectively.

5.4 Loop Caching 

We benchmark the loop-caching optimization by
executing a collection of queries of the simple template Q_(ab)+ =
(x, (ab)+, y) (Q2) with two WGPs, Pnc and Pfc, which
specify executions of Q with no loop caching and with full loop
caching, respectively.


Values for a and b were chosen by iterative pruning of
predicates appearing in the DBPedia dataset. First, we
excluded predicates with very high (more than 25M) and
low (less than 75K) cardinalities. Then, we ran query
Q_abab = (x, (abab), y) and recorded those (a, b) predicate
pairs for which the result of Q_abab was not empty. DBPedia
had 1171 such pairs, which indicates a high number of (ab)+
paths in this dataset. For each of these pairs, we ran the
full closure query Q_(ab)+ to obtain its expansion ratio r_exp,
a ratio of query-result cardinalities in which |Q| denotes the
cardinality of a query result.

Recall that both Pnc and Pfc initially evaluate (ab) paths
in the same way, while the rest of the closure (ab)+ is
computed differently. Hence, in order to show the differences
between these plans, we chose predicate pairs with r_exp ≫ 1,
so that the computation of the rest of the closure constitutes
the majority of the plan execution time. We identified 38
such queries by analyzing graph patterns in DBPedia.

We evaluated each one of these queries with Pnc and Pfc
plans and recorded the running time, edge walks and pruning
statistics. Due to widely varying absolute values for these
statistics across queries, we present their relative percentage
breakdowns in Fig. 17, as follows. Each query is represented
by a two-colored bar, which shows the percentage
breakdown of statistics values between Pnc and Pfc executions.
In this way, we present edge walks (in the left chart) and
running-time execution (in the right chart). We enumerate
the queries from D1 to D38 in ascending order
of the percentage of edge walks performed in the Pnc
execution relative to the Pfc execution. Hence, in query D1, Pnc
execution resulted in significantly fewer edge walks relative
to Pfc execution, with the opposite true for query D38.
Finally, we perform a further breakdown, for each query, of
the total number of edge walks into the number of tuples
which were cached, were reduced against the cache, or were
reduced against the delta. This breakdown is represented
by different shades of the color associated with Pnc or Pfc
executions, respectively.

Our first observation is that, in general, the loop caching
optimization can significantly increase or decrease the
total number of edge walks performed by the search. In our
benchmark, loop caching resulted in fewer edge walks in 68%
of the queries, with almost an order of magnitude reduction
in the best case. On the other hand, in 32% of the queries,
loop caching resulted in more edge walks, with a more than
5x increase in the worst case.

Figure 17: Edge walks vs. runtime in plans w/ & w/o loop caching in DBPedia.

Figure 19: Effect of plans on query evaluation: (a) search size
for different plans and pruning types; (b) redundancy pruning
(by type) over iterations of P2; (c) delta sizes over iterations;
(d) total query time.

Our second observation is that the query running time
is correlated to the total number of edge walks performed,
but with some deviations. In queries with bad loop caching
performance (D1-D8), the running time grows more slowly
than the number of edge walks. This is because, in these
queries, the majority of edge walks produced duplicate
tuples, which were removed against the delta. Such removals
are inexpensive, as discussed in §4.1. On the other hand, due
to the lack of delta removals in edge walks, we observe an
increase in the running time relative to the number of edge
walks in D17, D20, D24-31, and D33-34. The running
time for outliers D8 and D38 is affected by the cost of
intermediate concatenations performed during the evaluation.
Simple cost estimates (based on the product of relations)
for cranks over iterations are presented in Fig. 15. In D8,
|C_ab| ≫ |C_a| and |C_ab| ≫ |C_b|, which slows down the
concatenations in Pfc when compared to Pnc. The opposite is
observed in D38, yielding the advantage to Pfc over Pnc.

Lastly, we study the effect of lensing by analyzing the
degree of delta and cache pruning. Fig. 16 plots pruning
over iterations for the queries which exhibit lensing: D3 and
D38. Query D3 has M(G, a) = 10.58 and M(G, b) = 0.33,
which suggests lensing with a focal point on the concatenation
a/b. As discussed in §4.4, this can significantly increase the
amount of pruning for loop caching, which is indeed what
we observe. On the other hand, D38 has M(G, a) = 0.07
and M(G, b) = 5.34, which suggests lensing with a focal point
on the concatenation b/a. This lensing benefits loop caching
by decreasing the amount of pruning over iterations, which
is what we observe.

5.5 Combined Optimizations 

We illustrate the impact of combining Waveguide
optimizations over the example query

?p :marriedTo/:diedIn/:locatedIn+/:dealsWith+ USA    (Q4)

over the YAGO2s dataset. We instantiate plans as follows.

P1: single wavefront USA → ?p.

P2: single wavefront ?p → USA.

P3: two wavefronts
?p → :locatedIn+/:dealsWith ← USA.

P4: P2 but with a threaded sub-path
:locatedIn+/:dealsWith+ USA.

Fig. 19a shows the effect of wavefront choice on search
cardinality. Note the order of magnitude difference between
the best, P4, versus the worst, P1. The three types of
redundancy pruning (cache, delta, and fpp) are illustrated for
each plan. Fig. 19b plots search size across iterations for P2
with pruning; over 40% of tuples are pruned! Fig. 19c plots
delta sizes over iterations for P1 and P3. Note how the
selective search of P3 is better behaved than the rapid expansion
of P1. In Fig. 19d, the total execution time for each plan is
presented. This demonstrates the significant improvement
in performance achievable by careful design of the WGP.

6. NEXT STEPS & CONCLUSIONS

Waveguide plans model a rich space of plans for path 
queries which encompass powerful optimization techniques. 
Next steps in this endeavor are as follows. 

1. Benchmark Waveguide against current, prevalent 
SPARQL engines that support property path queries 
(e.g., Jena TDB, Virtuoso, and AllegroGraph). 

2. Build a full-fledged cost-based query optimizer for 
SPARQL 1.1 for property paths (RPQs). 

(a) Define “WGP” systematically to define formally 
the space of WGPs for a given query. 

(b) Devise a concrete cost model for WGPs. 

(c) Determine an array of statistics (e.g., 1-gram and 
2-gram label frequencies) that can be computed 
efficiently offline that can be used in conjunction 
with the cost model. 

(d) Design an enumeration algorithm to walk dynam¬ 
ically the space of WGPs to find the WGP with 
least estimated cost. 

3. Extend the query optimizer to handle queries
with multiple property-paths (equivalent to
conjunctive regular path queries).

Just as new data models necessitate new query languages, 
these new query languages necessitate new approaches if we 
are to evaluate their queries efficiently and effectively. The 
rise of graph databases has necessitated new, powerful query 
languages so that we can make use of them. But we are 
only beginning to understand how we can deal effectively with
these types of queries. 

In this work, we have devised a rich domain of evaluation
plans for property-path type queries in SPARQL, and have
shown it extends significantly over the state of the art. We
have demonstrated that choice of plan can make orders of
magnitude difference in performance. We have illustrated
the cost factors behind these plans’ performance and the
types of optimizations that can be achieved. We have shown
that which plans are effective depends on the underlying graph
database, which means a cost-based means of choosing plans
is required. The rise of graph data is well underway. And as
we learned in the past to do the “impossible” for relational
data, for semi-structured, for unstructured search, we too
will meet this challenge.